Supplementary Material A: Evaluation on CIFAR Benchmarks

Neural Information Processing Systems

Setup. We additionally evaluate GradNorm on a common benchmark with CIFAR-10 and CIFAR-100 [22] as ID datasets, a setup routinely used in the literature [13, 27, 14, 29, 26]. We use the standard split with 50,000 training images and 10,000 test images. The learning rate is initially 0.1 and decays by a factor of 10 at epochs 50, 75, and 90. Results. We summarize the results in Table 6, where GradNorm remains competitive. In particular, GradNorm reduces the average FPR95 by 8.77% on CIFAR-10 compared to the best baseline.
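The step schedule and the FPR95 metric mentioned above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function names `lr_at_epoch` and `fpr_at_95_tpr` are hypothetical, the schedule assumes 0-indexed epochs with the decay applied from the milestone epoch onward, and the metric assumes that a higher score means "more in-distribution".

```python
def lr_at_epoch(epoch, base_lr=0.1, milestones=(50, 75, 90), gamma=0.1):
    """Step schedule: multiply the rate by `gamma` at each milestone reached.

    Assumption: epochs are 0-indexed and the decay takes effect at the
    milestone epoch itself.
    """
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed


def fpr_at_95_tpr(id_scores, ood_scores):
    """Fraction of OOD samples accepted as ID when at least 95% of ID samples are accepted.

    Assumption: higher scores indicate in-distribution (ID) inputs.
    """
    sorted_id = sorted(id_scores)
    # Allow at most 5% of ID scores to fall below the threshold.
    k = (5 * len(sorted_id)) // 100
    threshold = sorted_id[k]
    accepted_ood = sum(1 for s in ood_scores if s >= threshold)
    return accepted_ood / len(ood_scores)


if __name__ == "__main__":
    for e in (0, 50, 75, 90):
        print(e, lr_at_epoch(e))
    # Toy scores, made up purely for illustration.
    print(fpr_at_95_tpr(list(range(100)), [0, 1, 2, 3, 4, 5, 50, 99, -1, -2]))
```

A lower FPR95 is better: it means fewer out-of-distribution inputs are mistaken for ID data at the operating point where 95% of ID inputs are retained.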




Binary Classification from Positive-Confidence Data

Neural Information Processing Systems

Can we learn a binary classifier from only positive data, without any negative data or unlabeled data? We show that if one can equip positive data with confidence (positive-confidence), one can successfully learn a binary classifier, which we name positive-confidence (Pconf) classification. Our work is related to one-class classification, which aims to describe the positive class with clustering-related methods. However, one-class classification offers no way to tune hyper-parameters, and its aim is not to discriminate between the positive and negative classes. For the Pconf classification problem, we provide a simple empirical risk minimization framework that is model-independent and optimization-independent. We theoretically establish consistency and an estimation error bound, and demonstrate the usefulness of the proposed method for training deep neural networks through experiments.
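To make the empirical risk minimization idea concrete, here is a minimal sketch of a Pconf-style objective. The abstract does not spell out the formula; the form used below, which weights the negative-class loss of each positive sample by (1 - r) / r with r the confidence p(y = +1 | x), is the one usually given for Pconf and should be read as an assumption here. The logistic loss and the function names are likewise illustrative choices.

```python
import math


def logistic_loss(z):
    """Numerically stable log(1 + exp(-z))."""
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


def pconf_empirical_risk(scores, confidences):
    """Average Pconf-style loss over positive samples only.

    scores:      classifier outputs g(x_i) on positive samples
    confidences: r_i = p(y = +1 | x_i), each assumed to lie in (0, 1]

    Assumed per-sample term: loss(g(x)) + (1 - r) / r * loss(-g(x)).
    """
    if len(scores) != len(confidences):
        raise ValueError("scores and confidences must have the same length")
    total = 0.0
    for g, r in zip(scores, confidences):
        if not 0.0 < r <= 1.0:
            raise ValueError("confidence must be in (0, 1]")
        total += logistic_loss(g) + (1.0 - r) / r * logistic_loss(-g)
    return total / len(scores)


if __name__ == "__main__":
    # Toy values, made up for illustration.
    print(pconf_empirical_risk([0.0, 0.0], [0.5, 1.0]))
```

Because the objective is an ordinary average of per-sample losses, it can be minimized with any model and optimizer, which is consistent with the abstract's description of the framework as model-independent and optimization-independent.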


A Proofs

Neural Information Processing Systems

Section A.1 presents the lemmas used to prove the main results. Section A.2 presents the main results. The first two inequalities are owing to the triangle inequality, and the third inequality is due to the definition of the L-divergence in Eq. (5). We complete the proof by applying Lemma A.1 to bound … Following the conditions of Theorem 4.1, the upper bound of Var[…] … Based on the conditions of Theorem 4.1, we assume … We complete the proof by applying Lemma A.3 and Lemma A.4 to bound the Rademacher … Following the proof of Theorem 4.1, we have |D … Following the conditions of Proposition 4.3, as N → ∞, we have … Based on the result of Proposition 4.3, for any δ ∈ (0, 1), we know that … 4LB(2 D ln 2 + 1) … We complete the proof by applying the triangle inequality. III: Samples from p and q are labeled with 0 and 1, respectively. All values are averaged over five trials.